Learning Q-Function Approximations for Hybrid Control Problems

Authors

Abstract

The main challenge in controlling hybrid systems arises from having to consider an exponential number of sequences of future modes to make good long-term decisions. Model predictive control (MPC) computes a control action through a finite-horizon optimisation problem. A key ingredient of this problem is the terminal cost, which accounts for the system's evolution beyond the chosen horizon. A well-chosen terminal cost can reduce the required horizon length and is often tuned empirically by observing performance. We build on the idea of using N-step Q-functions (Q(N)) in the MPC objective to avoid having to choose a terminal cost. We present a formulation incorporating the system dynamics and constraints to approximate the optimal Q(N)-function, and algorithms to train the approximation parameters through exploration of the state space. We test the policy derived from the trained approximations on two benchmark problems in simulation and observe that our algorithms are able to learn Q(N)-approximations with dimensions of practical relevance based on a relatively small data-set. We compare the controller's performance against Hybrid MPC in terms of computation time and closed-loop costs.
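To make the idea in the abstract concrete, below is a minimal Python sketch of a short-horizon hybrid MPC whose terminal term is a learned Q-approximation. The two-mode switched linear dynamics, the grid search over inputs, and the quadratic matrix P standing in for the trained Q(N) are illustrative assumptions, not the paper's formulation (which also handles constraints and trains the approximation parameters from exploration data).

import itertools
import numpy as np

# Two-mode switched linear system (assumed for illustration):
# x+ = A[m] x + B[m] u,  mode m in {0, 1}.
A = [np.array([[1.0, 0.1], [0.0, 1.0]]),
     np.array([[1.0, 0.1], [0.0, 0.8]])]
B = [np.array([[0.0], [0.1]]),
     np.array([[0.0], [0.05]])]
Qc, Rc = np.eye(2), 0.1 * np.eye(1)   # stage cost weights

# Stand-in for a learned Q-approximation: Q_hat(x) = x' P x,
# with P assumed to have been fit from exploration data.
P = np.array([[10.0, 0.0], [0.0, 5.0]])

def stage_cost(x, u):
    return float(x @ Qc @ x + u @ Rc @ u)

def hybrid_mpc(x0, horizon=3, u_grid=np.linspace(-1.0, 1.0, 11)):
    # Enumerate all mode sequences (exponential in the horizon, which is
    # why a learned terminal Q lets us keep the horizon short) and all
    # gridded input sequences; return the first input of the best plan.
    best_cost, best_u0 = np.inf, None
    for modes in itertools.product([0, 1], repeat=horizon):
        for inputs in itertools.product(u_grid, repeat=horizon):
            x, cost = x0.copy(), 0.0
            for m, u in zip(modes, inputs):
                u = np.array([u])
                cost += stage_cost(x, u)
                x = A[m] @ x + B[m] @ u
            cost += float(x @ P @ x)   # learned terminal Q_hat
            if cost < best_cost:
                best_cost, best_u0 = cost, inputs[0]
    return best_u0

print(hybrid_mpc(np.array([1.0, 0.0])))

Replacing the terminal term x' P x with a richer trained Q(N)-approximation, as the paper proposes, changes only the last line of the cost accumulation; the enumeration structure above is what the short horizon keeps tractable.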


Similar Articles

P14: Anxiety Control Using Q-Learning

Anxiety disorders are among the most common reasons for referral to specialized clinics. If the response to stress is changed, anxiety can be greatly controlled. The most obvious effect of stress occurs in the circulatory system, especially through sweating. The electrical conductivity of the skin, or in other words the Galvanic Skin Response (GSR), which depends on the stress level, is used; beside this parameter pe...


Q-Learning for Bandit Problems

Multi-armed bandits may be viewed as decompositionally structured Markov decision processes (MDPs) with potentially very large state sets. A particularly elegant methodology for computing optimal policies was developed over twenty years ago by Gittins [Gittins & Jones, 1974]. Gittins' approach reduces the problem of finding optimal policies for the original MDP to a sequence of low-dimensional stopping...


Tight Approximations for the Two Dimensional Gaussian Q-function

The aim of this work is the derivation of two approximated expressions for the two dimensional Gaussian Q-function, Q(x, y; ρ). These expressions are highly accurate and are expressed in closed form. Furthermore, their algebraic representation is relatively simple and therefore convenient to handle both analytically and numerically. This feature is particularly useful for two reasons: firstly b...


Q-Learning for Robot Control

Q-Learning is a method for solving reinforcement learning problems. Reinforcement learning problems require improvement of behaviour based on received rewards. Q-Learning has the potential to reduce robot programming effort and increase the range of robot abilities. However, most current Q-learning systems are not suitable for robotics problems: they treat continuous variables, for example speed...


An Online Learning Control Strategy for Hybrid Electric Vehicle Based on Fuzzy Q-Learning

In order to realize the online learning of a hybrid electric vehicle (HEV) control strategy, a fuzzy Q-learning (FQL) method is proposed in this paper. The FQL control strategy consists of two parts: the optimal action-value function Q*(x,u) estimator network (QEN) and the fuzzy parameter tuning (FPT). A back propagation (BP) neural network is applied to estimate Q*(x,u) as the QEN. For the fuzzy co...



Journal

Journal title: IEEE Control Systems Letters

Year: 2022

ISSN: 2475-1456

DOI: https://doi.org/10.1109/lcsys.2021.3094764